
University of East Anglia digital repository

SPSmart: adapting population based SNP genotype databases for fast and comprehensive web access

Author: Antonio Salas
Christopher Phillips
DL Hartl
E Peacock
GA Thorisson
HM Cann
JC Long
JH Gillespie
Jorge Amigo
JZ Li
NA Rosenberg
The International HapMap Consortium
Ángel Carracedo
Publication venue: BioMed Central
Publication date: 01/01/2008

Abstract. Background: In the last five years large online resources of human variability have appeared, notably HapMap, Perlegen and the CEPH foundation. These databases of genotypes with population information act as catalogues of human diversity, and are widely used as reference sources for population genetics studies. Although many useful conclusions may be extracted by querying databases individually, the lack of flexibility for combining data from within and between each database does not allow the calculation of key population variability statistics. Results: We have developed a novel tool for accessing and combining large-scale genomic databases of single nucleotide polymorphisms (SNPs) in widespread use in human population genetics: SPSmart (SNPs for Population Studies). A fast pipeline creates and maintains a data mart from the most commonly accessed databases of genotypes containing population information: data is mined, summarized into the standard statistical reference indices, and stored into a relational database that currently handles as many as 4 × 10^9 genotypes and that can be easily extended to new database initiatives. We have also built a web interface to the data mart that allows the browsing of underlying data indexed by population and the combining of populations, allowing intuitive and straightforward comparison of population groups. All the information served is optimized for web display, and most of the computations are already pre-processed in the data mart to speed up the data browsing and any computational treatment requested. Conclusion: In practice, SPSmart allows populations to be combined into user-defined groups, while multiple databases can be accessed and compared in a few simple steps from a single query. It performs the queries rapidly and gives straightforward graphical summaries of SNP population variability through visual inspection of allele frequencies outlined in standard pie-chart format. In addition, a full numerical description of the data is output in statistical results panels that include common population genetics metrics such as heterozygosity, F_ST and I_n.
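As an illustration of two of the metrics named in the abstract above, the following is a minimal sketch of expected heterozygosity and a simple F_ST for one biallelic SNP. It assumes equal weighting of populations and the textbook definition F_ST = (H_T - H_S) / H_T; the abstract does not state which estimators SPSmart actually uses, and the function names are hypothetical.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1-p) for a biallelic SNP with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst(freqs):
    """Simple F_ST across populations, weighting every population equally.

    freqs: allele frequency of the same allele in each population.
    Assumption: textbook (H_T - H_S) / H_T, not necessarily SPSmart's estimator.
    """
    p_bar = sum(freqs) / len(freqs)
    h_t = expected_heterozygosity(p_bar)  # heterozygosity of the pooled population
    if h_t == 0.0:
        return 0.0  # allele fixed or absent everywhere: no differentiation measurable
    # mean within-population heterozygosity
    h_s = sum(expected_heterozygosity(p) for p in freqs) / len(freqs)
    return (h_t - h_s) / h_t
```

Under these assumptions, identical frequencies in all populations give an F_ST of 0, and populations fixed for different alleles give 1.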

Springer - Publisher Connector

Genomic microsatellites identify shared Jewish ancestry intermediate between Middle Eastern and European populations

Author: A Abu
A Amar
A Chouraqui
A Nebel
A Zoossmann-Diskin
AB Olshen
AC Need
AL Price
AN Poliak
B Le Roux
C Tian
Chaolong Wang
D Carmelli
D Falush
DM Behar
DM Dunlop
Dov Gefel
E Kobyliansky
E Minch
EM Wijsman
F Cailliez
F Prugnolle
G Livshits
H Ostrer
HM Cann
IG Romero
J Corander
J Felsenstein
J Novembre
JK Pritchard
JL Mountain
Jossi Hillel
JP Huelsenbeck
JS Friedlaender
JZ Li
K Skorecki
KA Brook
L Jin
Lewi Stone
LL Cavalli-Sforza
M Bauchet
M Boehnke
M Jakobsson
M Nei
Marcus W Feldman
MF Hammer
MF Seldin
MG Thomas
MP Epstein
N Takezaki
NA Rosenberg
Naama M Kopelman
NE Morton
NH Timm
Noah A Rosenberg
R Patai
RM Goodman
S Karlin
S Ramachandran
S Sand
S Shringarpure
S Wang
SA Tishkoff
TC Falik-Zaccai
ZA Szpiech
Publication venue: BioMed Central
Publication date: 01/01/2009

Abstract. Background: Genetic studies have often produced conflicting results on the question of whether distant Jewish populations in different geographic locations share greater genetic similarity to each other or instead, to nearby non-Jewish populations. We perform a genome-wide population-genetic study of Jewish populations, analyzing 678 autosomal microsatellite loci in 78 individuals from four Jewish groups together with similar data on 321 individuals from 12 non-Jewish Middle Eastern and European populations. Results: We find that the Jewish populations show a high level of genetic similarity to each other, clustering together in several types of analysis of population structure. Further, Bayesian clustering, neighbor-joining trees, and multidimensional scaling place the Jewish populations as intermediate between the non-Jewish Middle Eastern and European populations. Conclusion: These results support the view that the Jewish populations largely share a common Middle Eastern ancestry and that over their history they have undergone varying degrees of admixture with non-Jewish populations of European descent.

Springer - Publisher Connector

Deep Blue Documents at the University of Michigan

RMIT Research Repository

ScholarBank@NUS

Straightforward Inference of Ancestry and Admixture Proportions through Ancestry-Informative Insertion Deletion Multiplexing

Author: A Tandon
AL Price
António Amorim
B Devlin
C Alkan
C Phillips
C Tian
Carla Santos
Christopher Phillips
D Falush
D Luca
DF Conrad
ER Londin
HM Cann
I Halder
J Marchini
JK Pritchard
JL Weber
JM Mullaney
JN Hirschhorn
JR Gonzalez
JR Kidd
L Excoffier
Leonor Gusmão
M Jakobsson
M Kayser
Manfred Kayser
MD Shriver
MF Seldin
N Yang
NA Rosenberg
NP Santos
Nádia Pinto
O Lao
P Donnelly
P Kersbergen
P Paschou
PH Sudmant
R Kosoy
R Nassir
R Pereira
R Redon
RE Mills
Rui Pereira
S Biswas
Sidney Emanuel Batista dos Santos
T Frudakis
Ángel Carracedo
Publication venue: Public Library of Science
Publication date: 01/01/2012

Ancestry-informative markers (AIMs) show high allele frequency divergence between different ancestral or geographically distant populations. These genetic markers are especially useful in inferring the likely ancestral origin of an individual or estimating the apportionment of ancestry components in admixed individuals or populations. The study of AIMs is of great interest in clinical genetics research, particularly to detect and correct for population substructure effects in case-control association studies, but also in population and forensic genetics studies.

Public Library of Science (PLOS)

Repositorio Institucional da Universidade de Santiago de Compostela

FigShare

Effective selection of informative SNPs and classification on the HapMap genotype data

Author: A Gusev
B Halldórsson
B Wu
E Halperin
I Guyon
I Levner
J Devore
J Jaeger
J Park
L Wang
Lipo Wang
LP Wang
M Stephens
NA Rosenberg
Nina Zhou
R Tibshirani
S Wright
TM Phuong
V Bafna
V Vapnik
WM Trochim
Y Su
Publication venue: BioMed Central
Publication date: 01/01/2007

Abstract. Background: Since single nucleotide polymorphisms (SNPs) are genetic variations that determine the differences between any two unrelated individuals, SNPs can be used to identify the correct source population of an individual. For efficient population identification with the HapMap genotype data, as few informative SNPs as possible are required from the original 4 million SNPs. Recently, Park et al. (2006) adopted the nearest shrunken centroid method to classify the three populations, i.e., Utah residents with ancestry from Northern and Western Europe (CEU), Yoruba in Ibadan, Nigeria in West Africa (YRI), and Han Chinese in Beijing together with Japanese in Tokyo (CHB+JPT), from which 100,736 SNPs were obtained and the top 82 SNPs could completely classify the three populations. Results: In this paper, we propose to first rank each feature (SNP) using a ranking measure, i.e., a modified t-test or F-statistics. Then from the ranking list, we form different feature subsets by sequentially choosing different numbers of features (e.g., 1, 2, 3, ..., 100) with top ranking values, train and test them by a classifier, e.g., the support vector machine (SVM), thereby finding one subset which has the highest classification accuracy. Compared to the classification method of Park et al., we obtain a better result, i.e., good classification of the 3 populations using on average 64 SNPs. Conclusion: Experimental results show that both the modified t-test and the F-statistics method are very effective in ranking SNPs by their classification capability. Combined with the SVM classifier, a desirable feature subset (with the minimum size and most informativeness) can be quickly found in a greedy manner after ranking all SNPs. Our method is able to identify a very small number of important SNPs that can determine the populations of individuals.
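The rank-then-select procedure described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: genotypes are assumed to be coded 0/1/2, the ranking uses a plain one-way ANOVA F-statistic (not the paper's modified t-test), and a nearest-centroid classifier is swapped in for the SVM so the example needs only the standard library. All function names are hypothetical.

```python
from statistics import mean


def f_statistic(values, labels):
    """One-way ANOVA F-statistic of one SNP's genotypes across population labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    grand = mean(values)
    k, n = len(groups), len(values)
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values()) / (k - 1)
    within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g) / (n - k)
    if within == 0:
        # no within-group variance: perfectly separating SNP, or a constant one
        return float("inf") if between > 0 else 0.0
    return between / within


def rank_snps(X, y):
    """Return SNP column indices ordered from most to least discriminative."""
    scores = [f_statistic([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


def nearest_centroid_accuracy(X_train, y_train, X_test, y_test, cols):
    """Accuracy of a nearest-centroid classifier (stand-in for the SVM) on columns cols."""
    centroids = {}
    for label in set(y_train):
        rows = [r for r, lab in zip(X_train, y_train) if lab == label]
        centroids[label] = [mean(r[j] for r in rows) for j in cols]
    correct = 0
    for row, lab in zip(X_test, y_test):
        pred = min(
            centroids,
            key=lambda c: sum((row[j] - m) ** 2 for j, m in zip(cols, centroids[c])),
        )
        correct += pred == lab
    return correct / len(y_test)


def select_subset(X_train, y_train, X_test, y_test, max_k):
    """Try the top-1, top-2, ..., top-max_k ranked SNPs; keep the smallest best subset."""
    ranking = rank_snps(X_train, y_train)
    best_k, best_acc = 0, -1.0
    for k in range(1, max_k + 1):
        acc = nearest_centroid_accuracy(X_train, y_train, X_test, y_test, ranking[:k])
        if acc > best_acc:  # strict '>' keeps the smallest subset among ties
            best_k, best_acc = k, acc
    return ranking[:best_k], best_acc
```

Because the ranking is computed once and subsets are grown in rank order, only `max_k` classifiers are trained instead of one per possible SNP combination, which is the sense in which the search is greedy.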

DR-NTU (Digital Repository of NTU)

Heterozygosity increases microsatellite mutation rate, linking it to demographic history

Author: A Di Rienzo
A Manica
A Purvis
AR Rogers
B Linz
DB Goldstein
F Prugnolle
H Ellegren
H Liu
J Swinton
JC Garza
JL Weber
Jonathan Flint
JZ Li
L Jin
LA Zhivotovsky
MG Hanford
MM Mahtani
NA Rosenberg
NA Yamada
Q-Y Huang
R Kolodner
R van Treuren
S Ramachandran
W Amos
William Amos
X Xu
Xin Xu
Y Vigouroux
Y Zhu
YD Kelkar
Publication venue: BioMed Central
Publication date: 01/01/2008

Abstract. Background: Biochemical experiments in yeast suggest a possible mechanism that would cause heterozygous sites to mutate faster than equivalent homozygous sites. If such a process operates, it could undermine a key assumption at the core of population genetic theory, namely that mutation rate and population size are independent, because population expansion would increase heterozygosity that in turn would increase mutation rate. Here we test this hypothesis using both direct counting of microsatellite mutations in human pedigrees and an analysis of the relationship between microsatellite length and patterns of demographically-induced variation in heterozygosity. Results: We find that microsatellite alleles of any given length are more likely to mutate when their homologue is unusually different in length. Furthermore, microsatellite lengths in human populations do not vary randomly, but instead exhibit highly predictable trends with both distance from Africa, a surrogate measure of genome-wide heterozygosity, and modern population size. This predictability remains even after statistically controlling for non-independence due to shared ancestry among populations. Conclusion: Our results reveal patterns that are unexpected under classical population genetic theory, where no mechanism exists capable of linking allele length to extrinsic variables such as geography or population size. However, the predictability of microsatellite length is consistent with heterozygote instability and suggests that this has an important impact on microsatellite evolution. Whether similar processes impact on single nucleotide polymorphisms remains unclear.

Harvard University - DASH

Springer - Publisher Connector

Oxford University Research Archive

An analytical upper bound on the number of loci required for all splits of a species tree to appear in a set of gene trees

Author: B Rannala
C Ané
C Than
C-I Wu
CG Schrago
E Milot
E Mossel
EM Jewett
ES Allman
F Bokma
G Dasarathy
J Heled
JA Rice
JH Degnan
L Liu
Lawrence H. Uricchio
M DeGiorgio
MT Hallett
NA Rosenberg
Noah A. Rosenberg
P Pamilo
PJ Cock
R Mehta
S Mirarab
S Roch
S Tavaré
T Stadler
Tandy Warnow
Y Wu
Y Yu
Publication venue: Springer Science and Business Media LLC
Publication date

Public Library of Science (PLOS)

Geographical Affinities of the HapMap Samples

Author: BE Stranger
BF Voight
Chris Tyler-Smith
DF Conrad
GT Powell
HM Cann
Jane Gitschier
JB Veyrieras
JC Mueller
JJ Mulero
JK Pritchard
JZ Li
KA Frazer
L Deng
L Roewer
LB Barreiro
M Jakobsson
MA Jobling
MI McCarthy
Miao He
NA Rosenberg
O Lao
O Semino
PC Sabeti
Peter de Knijff
Philip Awadalla
R Redon
S Willuweit
Tatiana Zerjal
Y Xue
Y Yamaguchi-Kabata
Yali Xue
ZH Rosser
Publication venue: Public Library of Science
Publication date: 04/03/2009

The HapMap samples were collected for medical-genetic studies, but are also widely used in population-genetic and evolutionary investigations. Yet the ascertainment of the samples differs from most population-genetic studies which collect individuals who live in the same local region as their ancestors. What effects could this non-standard ascertainment have on the interpretation of HapMap results? We compared the HapMap samples with more conventionally-ascertained samples used in population- and forensic-genetic studies, including the HGDP-CEPH panel, making use of published genome-wide autosomal SNP data and Y-STR haplotypes, as well as producing new Y-STR data. We found that the HapMap samples were representative of their broad geographical regions of ancestry according to all tests applied. The YRI and JPT were indistinguishable from independent samples of Yoruba and Japanese in all ways investigated. However, both the CHB and the CEU were distinguishable from all other HGDP-CEPH populations with autosomal markers, and both showed Y-STR similarities to unusually large numbers of populations, perhaps reflecting their admixed origins. The CHB and JPT are readily distinguished from one another with both autosomal and Y-chromosomal markers, and results obtained after combining them into a single sample should be interpreted with caution. The CEU are better described as being of Western European ancestry than of Northern European ancestry as often reported. Both the CHB and CEU show subtle but detectable signs of admixture. Thus the YRI and JPT samples are well-suited to standard population-genetic studies, but the CHB and CEU less so.

Positive Selection in East Asians for an EDAR Allele that Enhances NF-κB Activation

Author: A Fischer
A Franbourg
A Fujimoto
A Kumar
BF Voight
CS Carlson
David Hughes
DF Conrad
DJ Headon
Emilie Hardouin
HL Norton
HM Cann
I Thesleff
Irina Pugach
J Hey
Jarosław Bryk
Jason E. Stajich
JL Kelley
JM Akey
JP Pollinger
JS Friedlaender
JZ Li
K Tang
KR Thornton
L Frisse
LB Barreiro
M Przeworski
M Soejima
M Yan
MA Beaumont
Mark Stoneking
MR Waters
N Chassaing
N Izagirre
NA Rosenberg
NJR Fagundes
O Lao
P Koppinen
PC Sabeti
Rainer Strotmann
RC Edgar
RL Lamason
S Myles
S Wright
SA Tishkoff
SE Ptak
Sean Myles
SH Williamson
T Bersaglieri
VA Botchkarev
Y Shimomura
Publication venue: Public Library of Science
Publication date: 01/01/2008

Genome-wide scans for positive selection in humans provide a promising approach to establish links between genetic variants and adaptive phenotypes. From this approach, lists of hundreds of candidate genomic regions for positive selection have been assembled. These candidate regions are expected to contain variants that contribute to adaptive phenotypes, but few of these regions have been associated with phenotypic effects. Here we present evidence that a derived nonsynonymous substitution (370A) in EDAR, a gene involved in ectodermal development, was driven to high frequency in East Asia by positive selection prior to 10,000 years ago. With an in vitro transfection assay, we demonstrate that 370A enhances NF-κB activity. Our results suggest that 370A is a positively selected functional genetic variant that underlies an adaptive human phenotype.

Bournemouth University Research Online

MPG.PuRe

University of Huddersfield Repository

Explore Bristol Research

Recommended from our members

Bias Blind Spot: Structure, Measurement, and Consequences

Author: Aiken LS
Baron J
Carey K. Morewedge
Cohen J
Erin McCormick
H. Lauren Min
Irene Scopelliti
John OP
Karim S. Kassam
Macmillan NA
Nisbett RE
Nunnally JC
Paolacci G
Rosenberg M
Ross L
Sophie Lebrecht
Stanovich KE
Symborski C
Publication venue: Institute for Operations Research and the Management Sciences (INFORMS)
Publication date: 01/10/2015

People exhibit a bias blind spot: they are less likely to detect bias in themselves than in others. We report the development and validation of an instrument to measure individual differences in the propensity to exhibit the bias blind spot that is unidimensional, internally consistent, has high test-retest reliability, and is discriminated from measures of intelligence, decision making ability, and personality traits related to self-esteem, self-enhancement, and self-presentation. The scale is predictive of the extent to which people judge their abilities to be better-than-average for easy tasks and worse-than-average for difficult tasks, ignore the advice of others, and are responsive to an intervention designed to mitigate a different judgmental bias. These results suggest that the bias blind spot is a distinct metabias resulting from naïve realism rather than other forms of egocentric cognition, and has unique effects on judgment and behavior.

City Research Online